Observing Climate Change from Space

The Advanced SCATterometer (ASCAT) (Source: ESA)

10.1 Overview

In this notebook we will examine the capabilities of Advanced Scatterometer (ASCAT) to monitor droughts in Mozambique. ASCAT instruments are situated onboard the Metop satellites (EUMETSAT1) that are in orbit around the Earth. Since 2007, these missions have yielded a continuous record of microwave backscattering and continue to produce data for the future. The longevity of the ASCAT microwave backscatter record is therefore well-suited to track changes related to climate change, such as, El Niño induced rainfall patterns over Mozambique. The surface soil moisture (SSM) retrieved from the product showcased here is available at a sampling distance of 6.25\(\,\)km, this means that one value of soil moisture is available for every 50\(\,\)km\(^2\) (5000\(\,\)ha).

More information about microwave backscattering and the fundamentals of surface soil moisture retrieval from microwave backscatter signatures can be found here:

10.2 Imports

import cartopy.crs as ccrs
import holoviews as hv
import hvplot.pandas  # noqa
import matplotlib
import numpy as np
import pandas as pd
from bokeh.models import FixedTicker

10.3 Surface Soil Moisture of Mozambique (2007 - 2025)

Let us start by having a look at monthly aggregated SSM derived from ASCAT microwave backscattering over Mozambique. We can easily load the csv-file with pandas and then plot the results with hvplot. This creates an interactive plot whereby we can map the SSM values on an Open Street Map (OSM) and scroll through all months from 2007 to the present. For convenience, we added the locations of the in-situ sensors placed for each target district in the DrySAT project.

ROOT = "https://git.geo.tuwien.ac.at/api/v4/projects/1266/repository/files/"


def make_url(file, lfs="true"):
    return ROOT + file + f"/raw?ref=main&lfs={lfs}"


df = pd.read_csv(
    make_url("ascat-6_25_ssm_monthly.csv"), index_col="time", parse_dates=True
)


def load_cmap(file):
    def to_hex_str(x):
        return f"#{int(x.R):02x}{int(x.G):02x}{int(x.B):02x}"

    path = r"colour-tables%2Fssm-continuous.ct"
    df = pd.read_fwf(make_url(path, "false"), names=["R", "G", "B"], nrows=200)
    brn_yl_bu_colors = df.apply(to_hex_str, axis=1).to_list()
    return matplotlib.colors.LinearSegmentedColormap.from_list("", brn_yl_bu_colors)


SSM_CMAP = load_cmap("colour-tables/ssm-continuous.ct")

locations = {
    "Muanza": {"latitude": -18.9064758, "longitude": 34.7738921},
    "Chokwé": {"latitude": -24.5894393, "longitude": 33.0262595},
    "Mabote": {"latitude": -22.0530427, "longitude": 34.1227842},
    "Mabalane": {"latitude": -23.4258788, "longitude": 32.5448211},
    "Buzi": {"latitude": -19.9747305, "longitude": 34.1391065},
}

df_locs = pd.DataFrame.from_dict(locations, "index").reset_index()

points = df_locs.hvplot.points(
    x="longitude", y="latitude", color="black", crs=ccrs.PlateCarree()
)
labels = df_locs.hvplot.labels(
    x="longitude",
    y="latitude",
    text="index",
    text_baseline="bottom",
    text_color="black",
    crs=ccrs.PlateCarree(),
)

df.hvplot.points(
    x="longitude",
    y="latitude",
    c="surface_soil_moisture",
    groupby="time",
    x_sampling=0.08,
    y_sampling=0.08,
    rasterize=True,
    crs=ccrs.PlateCarree(),
    tiles=True,
    cmap=SSM_CMAP,
    clim=(0, 100),
    frame_width=500,
    clabel="Surface soil moisture (%)",
) * points * labels

10.4 Surface Soil Moisture Timeseries

Now let us have a closer look at the five locations marked on the SSM map and plot the SSM values against time for these 5 locations—known as timeseries. To do this we will use the full dataset filtered down to the five locations. This SSM dataset is thus not aggregated monthly and shows the full temporal resolution of the product. To visualize this, we highlight the density of data points falling in a certain sector of the plot with blue shading—bluer values mark a higher density of data points.

ts = pd.read_csv(
    make_url("ascat-6_25_ssm_timeseries.csv"), index_col="time", parse_dates=True
)

p = hv.Layout(
    [
        ts[ts.name == i].hvplot.scatter(
            x="time",
            y="surface_soil_moisture",
            title=i,
            rasterize=True,
            dynspread=True,
            threshold=1,
            frame_width=800,
            padding=(0.01, 0.1),
        )
        for i in ts.name.unique()
    ]
).cols(1)
p

The cyclical pattern of seasonal changes from dry to wet seasons can be easily discerned from the timeseries. Note, however, that we do not track precipitation, but the change from wet to dry soils. Moreover, we can see that the cyclical pattern breaks down on occasion as can be seen in the years 2015 and 2016, where especially Chokwé seems to display a complete lack of wetter soils during the 2016 rainy season. We can remove some of the noise in the records by aggregating the values on a monthly basis, as can be seen in the following code chunk. Here, the pandas dataframe method groupby can group the timeseries for all successive months with the pandas function Grouper(freq="ME") and the location name.

ts_monthly = (
    ts.groupby([pd.Grouper(freq="ME"), "name"]).mean().reset_index(level=["name"])
)
ts_monthly.hvplot.line(
    x="time",
    y="surface_soil_moisture",
    by="name",
    frame_width=800,
    padding=(0.01, 0.1),
)

In these aggregated timeseries we can more easily see differences in the averages and amplitudes between the locations. These differences in the average value can be an artefact of looking at relative differences in soil moisture content or can reflect actual differences, where locations are generally wetter than other regions. However, the amplitude, the magnitude of change in SSM over time is of higher importance for drought detection as we want to see if a drop in SSM during a dry season is “normal” when compared to other years, or more “extreme”

10.5 Normalization and Anomaly Detection

To filter out these differences between locations, and to emphasize how unusual certain dry periods are, we calculate Z score statistics.

\[ z_i = \frac{x_i - \bar{x}}{s^x} \]

The Z score statistic is an approach to detect anomalies in timeseries, where one measures how far a datapoint is removed from the mean. This distance from the mean by itself is not all that useful, as it depends on the location’s average SSM. To circumvent, and to more easily compare timeseries of different locations, we divide the distance of the mean with a measure of variation of the timeseries. This normalization step can be seen below, where the \(x\) value on the right-hand figure can be translated as: “this point is so many standard deviations removed from the mean”. In the code chunk below, we show how to calculate the z score statistics over the whole of a simulated timeseries by defining our own function zscore.

def zscore(x):
    return (x - x.mean()) / x.std()


np.random.seed(42)  # make example reproducible
mu, sigma = 50, 10  # mean and standard deviation
random_ts = pd.Series(np.random.normal(mu, sigma, 100))
(random_ts.hvplot.hist() + zscore(random_ts).hvplot.hist()).opts(shared_axes=False)

10.6 Drought Anomaly Detection

For our real time series, we will take it one step further by calculating the average for all months of January and comparing this to the monthly totals. This approach is repeated for all months. Execute the following code chunk to apply this approach. This is like the previous groupby operation but now we transform the panda’s column and use the datetime accessor month to accumulate monthly averages for the whole timeseries.

ts_zscore = ts_monthly.groupby(
    [ts_monthly.index.month, "name"]
).surface_soil_moisture.transform(zscore)
ts_zscore = ts_monthly.assign(zscore=ts_zscore)
ts_zscore.hvplot.line(
    x="time",
    y="zscore",
    by="name",
    frame_width=800,
    padding=(0.01, 0.1),
)

In the last plot we can now clearly discern the drought of 2015/2016 but also other droughts, such as during the years 2019/2020. The z score also appears to indicate more than usual drought in 2024/2025.

10.7 Monitoring Drought in Time and Space

As a last step, we can now apply this approach to the whole of Mozambique. Here we have already calculated the Z scores and we just have to plot them.

colorbar_opts = {
    "major_label_overrides": {
        -4: "Extreme",
        -3: "Severe",
        -2: "Moderate",
        -1: "Mild",
        0: "Normal",
    },
    "ticker": FixedTicker(ticks=[-4, -3, -2, -1, 0]),
}

df[df.zscore <= 0].hvplot.points(
    x="longitude",
    y="latitude",
    c="zscore",
    groupby="time",
    x_sampling=0.08,
    y_sampling=0.08,
    rasterize=True,
    crs=ccrs.PlateCarree(),
    tiles=True,
    cmap="reds_r",
    clim=(-4, 0),
    frame_width=500,
).opts(hv.opts.Image(colorbar_opts={**colorbar_opts})) * points * labels

This temporospatial analysis (in time and space) confirms that 2015/2016 was particularly pronounced in the south of the country surrounding the region of Chokwé. But this intense drought was also prevalent in the northern districts neighbouring Malawi. This is something that would not be seen in a spot-wise analysis.

In the next notebooks, we will compare this microwave-based technique to other indicators of drought such as SPEI and vegetation-based indicators of drought (NDVI).


  1. European Organisation for the Exploitation of Meteorological Satellites↩︎